ABSTRACT

A market segmentation analysis was conducted on 
students at a large midwestern urban university using two forms of 
hierarchical cluster analysis on student characteristics: an 
agglomerative procedure using a matching-type association measure and
a divisive chi-square based automatic interaction detection (CHAID) 
procedure. Data were extracted from institutional records and a 
survey of 872 students concerning satisfaction with 48 different 
campus aspects and importance of 18 goals for college study. Eight 
clusters resulted from a matching-type measure/Ward's method
clustering analysis, while the CHAID procedure resulted in a six
cluster solution. Comparative analysis revealed that both procedures
produced differences across only two of six satisfaction scales. The
matching-type measure clusters resulted in significant differences on 
11 of 18 college study priority items compared to only 6 of 18 for 
the CHAID clusters. The study concludes that the matching-type 
measures/Ward's method procedure produced more easily interpretable 
clusters with more corresponding differences in student priorities 
for attending college. The CHAID procedure serves better when there 
is a single outcome of high interest for distinguishing among 
students, in this case general academic satisfaction. The usefulness 
of market segmentation strategies for planning, evaluating, and 
improving academic and student support programs is discussed. 
(Contains 20 references.) (Author/JDD) 

Segmenting Student Markets With a Student Satisfaction and Priorities Survey

Victor M. H. Borden
Indiana University-Purdue University Indianapolis (IUPUI)

Abstract 

A market segmentation analysis was conducted on students at a large midwestern urban
university using two forms of hierarchical cluster analysis on student characteristics: an
agglomerative procedure using a matching-type association measure and a divisive chi-square
based automatic interaction detection (CHAID). The resulting segments were compared for their
ability to distinguish among students according to six satisfaction scales and measures of students'
priorities for college study derived from a general satisfaction survey. As expected, the CHAID
clusters discriminated better among students according to their levels of satisfaction, although
both procedures produced differences across only two of six satisfaction scales. The
matching-type measure clusters resulted in significant differences on 11 of 18 college study priority items
compared to only 6 of 18 for the CHAID clusters. Final discussion describes the usefulness of
market segmentation strategies for planning, evaluating, and improving academic and student
support programs. 

Introduction 

The student population at many universities is becoming increasingly diverse. Recent 
estimates indicate that over one-half of all current college students are older than 25 years and 
over one-half now attend college part-time (Jacoby, 1990). The increasing diversity of students, 
both in terms of backgrounds and lifestyle, has led to a call for identifying meaningful subgroups
of students when designing support programs (Borden & Gentemann, 1993). 

Market segmentation strategies provide methods for identifying important subgroups of 
students for needs assessment and program development (Wakstein, 1987). This paper describes 
a study that compares two hierarchical clustering procedures for deriving market segments: one 
employing matching-type measures and an agglomerative clustering algorithm and another using
the chi-square based automatic interaction detection (CHAID), a divisive algorithm using binary
splits on categorical variables. The analyses use as input demographic characteristics of a sample
of students at a large public midwestern urban university. The validity of the resulting market
segments is explored using student responses to a general satisfaction survey. 

Market Segmentation in Higher Education 

Bonoma and Shapiro (1983) define market segmentation as a "process of separating a 
market into groups of customers... such that the members of each resulting group are more like the 
members of that group than like members of other segments" (p. 1). They argue that this activity 
provides a better understanding of buying behaviors, the ability to choose market segments that a 
company can best serve, and support for the development of plans to profit from meeting the 
needs of targeted market segments. 

The predominant use of market segmentation strategies in higher education has been in the 
development of college marketing and recruitment programs (Goldgehn, 1989; Grabowski, 1981; 
Merante, 1982). These methods have also been used for other areas of program development 
such as public relations (Grunig, 1990) and career planning and placement (Cowles & Franzak, 
1991). More generally, market segmentation has been suggested as a strategy for understanding 
college choice (Rickman & Green, 1993; Muffo, 1987; Zemsky & Oedel, 1983).

Cluster Analysis as a Method for Segmenting the Student Market 

Cluster analysis refers to any of a wide variety of numerical procedures that can be used to 
create a classification scheme. Clustering methods have long been recognized for their potential 
usefulness and recent versions of standard statistical packages (e.g., SAS, SPSS, BMDP) include 
a variety of clustering procedures. Conceptually, cluster analysis is easy to understand and well 
suited to market segmentation. However, unlike other multivariate procedures, it is not 
supported by extensive statistical reasoning and its use of various heuristic strategies provides for 
inconsistent results (Aldenderfer & Blashfield, 1984). Despite these limitations, cluster analysis
has been used successfully to define market segments as exemplified by Beder's (1986) study of
adult basic education students. 

Selecting Variables. The most popular forms of cluster analysis are based on measures 
of "similarity" among objects according to some combination of attributes. In the context of 
identifying student market segments, the objects are students and the attributes can be virtually 
any student characteristics including personal or family demographics, levels of academic 
preparation, attitudes and interests, expectations and goals, program of study, college 
performance, etc. Because of the inherent inconsistency among various clustering procedures, 
Aldenderfer and Blashfield (1984) argue that the choice of variables is one of the most critical 
steps in the analysis process and that it should be guided by an explicit theory. 

The selection of variables for a student market segmentation analysis can be guided by 
both theory and practicality. Theories of student involvement in college, like those of Tinto 
(1975) and Astin (1987), propose that success in college is directly related to a student's ability to
become involved, psychologically and behaviorally, in the college environment. Students' ability
to become involved has, in turn, been positively associated with being a full-time student, living
on campus, working on campus, and other time spent outside class on campus, while lack of
involvement has been associated with number and strength of off-campus commitments, such as
work and family. 

On the practical side, higher education researchers typically have ready access to certain 
types of student characteristics from college and university operational information systems. 
These include such attributes as prior academic experience along with some measures of 
performance, personal and family demographics, and level of progress and performance in college. 
These are often supplemented by surveys to assess student satisfaction in college, reasons for 
attending college, as well as other aspects of students' lives, such as employment and living 
situation. From among these sources of information, one can select student characteristics that 
have been associated with levels of student involvement in the academic and social milieus of the 
campus environment. These can include academic background, work and family commitments,
living situation, and level of progress within college. The specific variables chosen for this study 
are presented in the method section below. 

Choosing a Similarity Measure and Clustering Algorithm. After selecting variables,
cluster analysis requires the selection of a similarity measure and a clustering algorithm.¹
Similarity measures can be either measures of distance (geometric distance between points in a 
multi-dimensional space) or similarity (association or correlation coefficients). The type of 
variables chosen for analysis constrains the choice of similarity measure. When using nominal 
variables such as sex, marital status, and race, one must either use measures based on association 
coefficients ("matching-type" measures) or use a technique called chi-square based automatic 
interaction detection (CHAID). 

The use of matching-type measures requires choosing a clustering algorithm. Clustering 
algorithms are generally based on hierarchical or partitioning techniques. Hierarchical algorithms 
can either start with each object occupying its own cluster and then fuse together clusters
(agglomerative method) or start with one large group and divide the objects into smaller 
subgroups (divisive method). Partitioning techniques require the prior statement of number of 
clusters and then use a predefined criterion for optimizing distances between clusters. The choice 
of clustering algorithm is complex involving questions of expected geometric shapes of the 
resulting clusters, number of clusters present, overlap of clusters, and presence of outliers
(Aldenderfer & Blashfield, 1984). Hierarchical agglomerative methods, such as the average 
linkage method and Ward's (1963) error sum of squares method, have been most popular in the 



¹For a brief treatment of the topic of cluster analysis, see Dillon and Goldstein (1984) and Aldenderfer and
Blashfield (1984). A more complete treatment of clustering algorithms is available in Hartigan (1975). Sokal and
Sneath's (1963) book Principles of Numerical Taxonomy is often cited as the seminal work in this field.

social sciences. Ward's method, which is biased toward tight hyperspherical clusters, is utilized in
this study to represent a popular hierarchical agglomerative clustering algorithm. 
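
As a rough illustration of how a hierarchical agglomerative algorithm proceeds, the following Python sketch merges clusters using the Lance-Williams recurrence commonly given for Ward's method. It is not the SPSS implementation used in this study. The function name `ward_cluster`, the toy dissimilarity matrix, and the decision to stop at a fixed number of clusters are assumptions made for illustration only.

```python
def ward_cluster(dist, n_clusters):
    """Agglomerate objects until n_clusters remain.

    dist is a symmetric matrix of pairwise dissimilarities. Merging uses the
    Lance-Williams update for Ward's method on squared dissimilarities.
    """
    n = len(dist)
    members = {i: [i] for i in range(n)}  # cluster id -> object indices
    # Squared dissimilarity for every pair of active clusters, keyed (low id, high id).
    d2 = {(i, j): dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(members) > n_clusters:
        # Pick the pair of clusters whose merger is cheapest.
        (a, b), d_ab = min(d2.items(), key=lambda kv: kv[1])
        na, nb = len(members[a]), len(members[b])
        merged = members.pop(a) + members.pop(b)
        # Update the dissimilarity from every remaining cluster to the new one.
        for c in list(members):
            nc = len(members[c])
            d_ac = d2.pop(tuple(sorted((a, c))))
            d_bc = d2.pop(tuple(sorted((b, c))))
            d2[(c, next_id)] = (
                (na + nc) * d_ac + (nb + nc) * d_bc - nc * d_ab
            ) / (na + nb + nc)
        del d2[(a, b)]
        members[next_id] = merged
        next_id += 1
    return sorted(sorted(m) for m in members.values())


# Toy example: four objects at positions 0, 1, 10, and 11 on a line.
positions = [0, 1, 10, 11]
dist = [[abs(p - q) for q in positions] for p in positions]
clusters = ward_cluster(dist, 2)
print(clusters)
```

With binary student attributes, the dissimilarity matrix might be taken as one minus a matching-type similarity coefficient. Such dissimilarities are not Euclidean distances, so the recurrence should be read as a heuristic in that setting.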

Automatic interaction detection (AID) is a method of clustering developed by Sonquist
and Morgan (1964) that has become increasingly popular for market segmentation analysis.
Unlike other forms of cluster analysis, AID uses a criterion variable in addition to classification 
variables, so as to optimize cluster differences. AID is a hierarchical divisive method that uses 
binary splits to divide the sample into successive subgroups based on selecting a predictor variable 
that maximizes reduction in the unexplained variation of the criterion variable. Chi-square based 
automatic interaction detection (CHAID) can be used when at least some of the classification 
variables are measured on a nominal scale. The growing popularity of the CHAID procedure has 
been fostered by its availability in the popular software package SPSS. Lay and Maguire (1983) 
demonstrated the usefulness of the CHAID procedure for estimating qualified inquiries from 
among a market of prospective applicants. 

This study compares the clusters derived from student characteristic data using Ward's 
method with a matching-type similarity measure to the clusters derived from the CHAID
procedure. The results of the clustering procedures will be evaluated by their ability to distinguish
among students according to their levels of satisfaction with various aspects of their college 
experience and their personal priorities for college study. 

Method 

The data for this study were extracted from institutional records and a survey of 
undergraduate students enrolled in degree programs at a large midwestern urban university. The
survey instrument included ratings of satisfaction with 48 different aspects of the campus,
including academics, academic supports, and student support services. Students also rated the
importance of 18 goals for college study including ones relating to academic progress, career
preparation, career improvement, social and cultural participation, and personal enrichment.
Finally, students provided information about their lives outside college, including employment and
living circumstances. 

Surveys were mailed to a sample of 1700 undergraduate degree-seeking students in the 
spring 1993 semester and responses were received from 872 students (51.3%). The respondents 
were found to represent the student population in terms of ethnicity, major, class level, and course
load status. The sample over-represented women (67% in sample; 60% in population) and older
students (57% aged 25 or older in sample; 45% in population). 

Clustering Characteristics. A matching-type measure of similarity requires that the
classification variables be converted into binary (0-1) variables. To do this, each characteristic
(e.g., sex) has to be converted into a series of variables, one for each value² (e.g., male = 0, 1; and
female = 0, 1). For the CHAID procedure, as supported by the SPSS software, the classification
variables can have as many as 31 distinct values. The CHAID procedure will create subsets of the
categories on each variable that maximize between group variation and minimize within group 
variation. Table 1 shows the student characteristics that were used in the clustering procedures 
with the corresponding variables employed for the matching-type measure analysis and the 
corresponding values for the CHAID analysis. 
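
The binary coding described above can be sketched in Python as follows. This is a minimal illustration, not the procedure actually run in SPSS. The function names and the two example characteristics are hypothetical, and a missing value is coded as zero on every indicator for that characteristic, following the convention the study adopted.

```python
# Hypothetical characteristics and their category values (illustration only).
CHARACTERISTICS = {
    "sex": ["male", "female"],
    "marital_status": ["single", "married", "separated/widowed/divorced"],
}


def to_indicators(value, categories):
    """Return one 0/1 indicator per category; all zeros if the value is missing."""
    return [1 if value == category else 0 for category in categories]


def encode_student(record):
    """Concatenate the indicator variables for every characteristic."""
    row = []
    for name, categories in CHARACTERISTICS.items():
        row.extend(to_indicators(record.get(name), categories))
    return row


# A female student whose marital status is missing.
student = {"sex": "female", "marital_status": None}
encoded = encode_student(student)
print(encoded)  # [0, 1, 0, 0, 0]
```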



²It is possible to use a variable for all but one of the values and have the last value represented by zero values for all
other variables. For the present study, all values were represented by a variable and the all zero value condition 
was reserved for missing values. 

Table 1. Student Characteristics Used in Cluster Procedures

Student Characteristic / Matching-type Measure (N) / CHAID

Academic Unit
    Matching-type measure: 6 variables: Undeclared (222), Arts and Sciences (170), Nursing (93),
        Engineering & Tech (94), Satellite Campus (51), All Other [N illegible]
    CHAID: 17 values including 14 academic schools, undeclared majors, continuing studies, and a
        satellite campus

[Marital Status]
    Matching-type measure: 3 variables: Single (462), Married (307), Separated/Widowed/Divorced (93)
    CHAID: 3 values (single, married, separated/widowed/divorced)

Children at Home
    Matching-type measure: 2 variables: Yes (276), No (586)
    CHAID: 2 values (yes, no)

[Work Load]
    Matching-type measure: 4 variables: Not working (178), 1-19 hours/wk (117), 20-35 hours/wk (235),
        36+ hours/wk [N illegible]
    CHAID: 15 values (0 hours, then groups based on increments of five hours: 1-5, 6-10 ... 66-70),
        ORDERED

[Course Load]
    Matching-type measure: 3 variables: 1-6 hours (309), 7-11 hours (160), 12+ hours (404)
    CHAID: 23 values (1 through 23 hours), ORDERED

[Class Level]
    Matching-type measure: 4 variables: Freshman (198), Sophomore (264), Junior [N illegible],
        Senior (247)
    CHAID: 4 values (freshman, sophomore, junior, senior)

[First-Generation Status]
    Matching-type measure: 2 variables: Yes (512), No (349)
    CHAID: 2 values (yes, no)

[Age]
    Matching-type measure: 4 variables: 18-21 (234), 22-25 (210), 26-34 (235), 35+ (203)
    CHAID: 22 values (18 or less, then single year increments through 27, 2 year increments
        through 52, and 53+), ORDERED

[Sex]
    Matching-type measure: 2 variables: Female (573), Male (286)
    CHAID: 2 values (male, female)

[Minority Status]
    Matching-type measure: 2 variables: Minority (129), Not Minority (726)
    CHAID: 7 values (standard EEO categories)

Criterion and Validity Variables.

A principal components factor analysis with varimax rotation was conducted on the 48
student satisfaction items. Six different satisfaction subscales were identified. Variables with
factor loadings greater than 0.50 were included in each of the six scales but the actual scales were
constructed using unit weights, rather than factor loadings. Table 2 displays the resulting scales
along with their reliability coefficients (Cronbach's alpha). The first scale represents a more
general rating of students' satisfaction with academics and subsequent scales relate to more
specific support areas, such as financial aid and computer availability. These scales were 
employed to compare the results of the two clustering procedures. 
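
The scale construction described above can be illustrated with a short Python sketch. It assumes that a unit-weighted scale score is the unweighted sum of its items and uses the standard formula for Cronbach's alpha, alpha = k/(k-1) * (1 - sum of item variances / variance of total score). The function names and the response data are invented for illustration and are not the study's data.

```python
from statistics import variance


def unit_weighted_scores(responses):
    """Scale score per respondent: every item counts equally (weight 1)."""
    return [sum(items) for items in responses]


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item ratings."""
    k = len(responses[0])
    item_variances = [variance([r[i] for r in responses]) for i in range(k)]
    total_variance = variance(unit_weighted_scores(responses))
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Invented ratings from four respondents on a three-item scale.
responses = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 5]]
scores = unit_weighted_scores(responses)
alpha = cronbach_alpha(responses)
print(scores, round(alpha, 3))
```

Unit weighting gives every retained item the same influence on the scale score, so the factor loadings serve only to decide which items belong to a scale.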

Students indicated their personal priorities for college study by rating each of 18 items as 
being of low, medium, or high importance. Table 3 lists the 18 items that students rated 
organized into the general areas of academic, career-preparation, career-improvement, social and 
cultural participation, and personal enrichment. 

Clustering Procedures 

All clustering procedures were performed using SPSS® for Windows™ Version 6.0 
software. Matching-type measures using binary data are based on counts of the number of
attributes that are present or absent among cases. That is, for each possible pairing of subjects, a 
two-by-two matrix is formed with counts of the number of attributes that both subjects have in 
common (both have or both do not have), and the two ways in which one subject has the attribute 
and the other does not. Different distance measures can be calculated depending on which cells 
are included in calculating the association coefficient. For the current analysis, the Jaccard 
measure was used, which excludes the counts for when both subjects do not have the attribute. 
This was chosen to exclude missing values that were coded as zero on all attribute variables for a 
given characteristic (e.g., if sex is missing then male=0 and female=0). The resulting coefficient is 
the number of attributes both subjects have in common over the total number of attributes 
considered. 
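
A minimal Python sketch of the Jaccard coefficient follows. It counts the attributes both students have and the attributes only one of them has, and ignores attributes that neither has, so that an all-zero (missing) characteristic does not count as a match. The function name and the example vectors are hypothetical, and the layout of the vectors (two sex indicators followed by three marital status indicators) is assumed for illustration.

```python
def jaccard_similarity(x, y):
    """Jaccard coefficient for two equal-length 0/1 attribute vectors."""
    both = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    only_one = sum(1 for a, b in zip(x, y) if a != b)
    considered = both + only_one  # joint absences are excluded
    return both / considered if considered else 0.0


# Indicators: male, female, single, married, separated/widowed/divorced
student_a = [0, 1, 1, 0, 0]  # female, single
student_b = [0, 1, 0, 1, 0]  # female, married
student_c = [0, 0, 1, 0, 0]  # sex missing, single

sim_ab = jaccard_similarity(student_a, student_b)
sim_ac = jaccard_similarity(student_a, student_c)
print(sim_ab, sim_ac)
```

In this example the missing sex value for the third student lowers the similarity to the first student (one shared attribute out of two considered) rather than being treated as agreement.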

Table 3. Personal Priorities for College Study: Survey Items

Academic Goals
    To increase my knowledge and understanding in an academic field
    To obtain a certificate or degree
    To complete courses necessary to transfer to another college/university
    To increase my grade-point average

Career-Preparation Goals
    To discover career interests
    To formulate long-term career plans and/or goals
    To prepare for a new career

Job or Career-Improvement Goals
    To improve my knowledge, skills, and competencies for my job or career
    To increase my chances for a raise and promotion
    To get a better job

Social- and Cultural-Participation Goals
    To become actively involved in student life and campus activities
    To increase my participation in cultural and social events
    To meet people

Personal-Development and Enrichment Goals
    To increase my self-confidence
    To improve my leadership skills
    To improve my ability to get along with others
    To learn skills that will enrich my daily life
    To develop my ability to be independent, self-reliant, and adaptable

The general academic satisfaction scale was chosen as the criterion variable for the
CHAID procedure and an ordinal analysis was performed to match the scale of this criterion. For
the CHAID procedure, clustering is performed using one predictor variable at a time. Typically,
objects are first clustered according to the predictor that accounts for the largest difference on
the criterion variable. Subsequent clusters are identified by taking the next most significant 
predictor and breaking up the first set of groups according to the values of the second predictor. 
Different predictors may be selected for each cluster formed by the preceding predictor. The user 
can create different solutions by choosing different combinations of predictors during different 
passes. Automatic mode was chosen to let the CHAID procedure build the cluster tree starting 
with the most significant predictor and continuing until no further significant predictors were 
found. 
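
The idea of choosing the most significant predictor at each pass can be illustrated with a simplified Python sketch. It computes the Pearson chi-square statistic for the cross-tabulation of each predictor with the criterion and picks the predictor with the largest statistic. This is only one step of the logic and not the full CHAID procedure, which also merges predictor categories and compares adjusted significance levels. The counts, predictor names, and two-category criterion are invented, and both predictors are given the same degrees of freedom so that the statistics can be compared directly.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Invented counts: rows are predictor categories, columns are (low, high) satisfaction.
crosstabs = {
    "sex": [[40, 60], [45, 55]],
    "class_level": [[30, 70], [55, 45]],
}
results = {name: chi_square(table) for name, table in crosstabs.items()}
best = max(results, key=lambda name: results[name][0])
print(best, results[best])
```

In an automatic run, the same selection would be repeated within each resulting subgroup until no predictor remains significant.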

Results 

Cluster Solutions 

Table 4 shows the eight clusters that resulted from the matching-type measure/Ward's
method clustering analysis. Each cluster is identified by the profile of student demographics and is
described according to the distinguishing features of that profile. For example, Cluster M1 is
characterized by younger students (83% 18-21 years old compared to 27% of full sample), who
are first generation college students in their families (91% compared to 60% of sample), single
(97% compared to 54% of sample), and attend college full-time (92% compared to 46% of
sample). An empty cell within a cluster signifies that the group does not differ significantly from
the population profile on that characteristic. 

The first four clusters generally represent full-time students who are relatively young,
single, and have no children. Cluster M1 is distinguished among this group as having the
youngest students who are first-generation college goers. Cluster M2 includes the non-first- 
generation female students. Cluster M3 has many senior-level students and few minorities. 
Finally, within this group of more "traditional" college students, Cluster M4 contains the majority 
of the sample's minority students who tend to have lower course and work load levels compared 
to the other three clusters. 

The last three clusters contain relatively older students who are married and have children. 
Cluster M6 includes the oldest group of students, almost exclusively females, who work full-time 
and take only one or two courses. Cluster M7 includes many adult learners who are either out of 
work completely or work part-time while maintaining as much as a full-time course load. Cluster 
M8 contains adult students who maintain significant work, family, and school obligations. 

Cluster M5 represents a middle-ground between the first four and last three clusters. Like 
the first four clusters and unlike the last three, this group is not likely to have significant family 
obligations. Unlike the first four clusters, they tend to maintain a part-time course-load while 
working full-time. This group is typical of the full sample, that is, diverse, in terms of class level,
first generation status, age, sex, and minority status. 

Figure 1 shows the cluster tree that resulted from the CHAID analysis using the general 
academic satisfaction as the criterion variable. Although the criterion is treated as an ordinal 
categorical variable in the procedure, Figure 1 displays the normalized group average for the
satisfaction scale. Therefore, for the full sample, the base value is zero and the values for clusters
represent standard deviation units from the overall mean.
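
The normalization described above can be sketched in Python: each cluster's mean satisfaction is expressed as its distance from the full-sample mean in standard deviation units. The function name, the cluster labels, and the scores are invented for illustration, and the use of the sample (rather than population) standard deviation is an assumption, since the source does not say which was used.

```python
from statistics import mean, stdev


def normalized_group_means(scores, labels):
    """Group means expressed in standard deviation units from the overall mean."""
    overall_mean = mean(scores)
    overall_sd = stdev(scores)  # sample standard deviation (assumed)
    by_group = {}
    for score, label in zip(scores, labels):
        by_group.setdefault(label, []).append(score)
    return {
        label: (mean(values) - overall_mean) / overall_sd
        for label, values in by_group.items()
    }


# Invented satisfaction scores for two hypothetical clusters.
scores = [1, 2, 3, 4, 5, 6]
labels = ["low", "low", "low", "high", "high", "high"]
normalized = normalized_group_means(scores, labels)
print(normalized)
```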

The single best predictor of differences in students' general academic satisfaction was the
academic unit in which the student was enrolled. Three clusters of academic units emerged as
indicated in Figure 1, with the first cluster having the highest average satisfaction ratings and the
third group the lowest. The second pass of the analysis split the first cluster into two additional
groups, the first containing freshmen, sophomores, and juniors, and the second containing all
seniors. A different predictor was identified for the second academic unit cluster. Here the group
was subdivided according to sex. A third pass found age to be a further significant predictor
within the male cluster, separating students under 25 years old from those who were 25 or older. 
There were no further predictors of satisfaction among the third academic unit group. 

When all significant predictors had been found, the CHAID procedure resulted in a six
cluster solution. Clusters C1 and C2 represent the non-seniors and seniors, respectively, from the
first set of academic units. Clusters C3 and C4 represent the younger and older males,
respectively, from within the second academic unit group, and Cluster C5 represents the female
students from the second academic unit group. Finally, Cluster C6 represents all students in the
third academic unit group. 

Cluster Validity 

To compare cluster solutions, differences among clusters were examined according to the 
overall student satisfaction³, and the six satisfaction scales and 18 goal items described earlier.
Table 5 shows the results of these comparisons for the matching-type measure/Ward's method and
CHAID analyses. The table includes only those items for which significant
differences were found for either set of clusters at the p = .05 level⁴.

³The overall student satisfaction was measured by a single item that asked students "Overall, how satisfied are you
with your experiences at this university?" Responses were allowed in one of four categories: 'very satisfied',
'satisfied', 'dissatisfied', and 'very dissatisfied'.

⁴The reader can refer back to Tables 2 and 3 for the complete set of items.

The two different cluster solutions are associated with a number of significant differences
in student satisfaction and priorities. The CHAID clusters yield larger differences in student 
satisfaction. This is to be expected since the CHAID procedure used the general academic 
satisfaction group as the criterion variable. On the other hand, the matching-type clusters also 
yielded some significant differences in student satisfaction and yielded generally larger differences 
in student priorities, neither of which were used to form the clusters. Each set of cluster solutions 
produced differences in only two of the six satisfaction scales (general academic for both, financial 
aid for CHAID, and course availability for Matching-Type). The Matching-Type procedure 
yielded significant differences on 11 of the 18 priority items, while the CHAID analysis yielded
differences on only 6 of them.

For the Matching-Type clusters, the difference in student priorities corresponds in
expected ways with the cluster composition. For example, Cluster M5, which represented the
"middle-ground" group, is also found in the middle ground with respect to satisfaction and
priorities. Clusters M1 through M4, which represent the more traditional students, have higher
priorities for involvement on campus and personal enrichment, while the older student clusters 
show generally lower priorities in these areas. Cluster M7, which includes many out-of-work 
adults, shows a high interest in finding a new career. As a final example, the older working 
students, who are attending school part-time (Cluster M6), do not appear to be as driven by these
college study goals. One would expect that these students are less involved in college than others
who are looking for more specific social, academic, and career gains from their college
experience. 

The results of the CHAID clustering are not as easily interpreted. The groups with the 
highest level of general academic satisfaction (the criterion variable) tend to rate some critical 
college study priorities relatively low. Specifically, Clusters C2 and C4 are the most satisfied, but 
appear to care less than the others about obtaining a degree or improving their GPA. 
Furthermore, the least satisfied group, Cluster C6, shows average levels of priorities across all
items. The young male students in Cluster C3 do appear more interested in the campus social
climate compared to members of the other clusters. 

Implications 

Market segmentation strategies hold great promise for program planning and evaluation, 
especially at large institutions that serve diverse student populations. It is becoming increasingly 
clear that programs cannot be designed for a typical student when students differ so greatly, nor 
are resources available to make individualized approaches to program development feasible. 

Cluster analytic procedures are useful for identifying market segments for programmatic
planning, design, and evaluation, but these procedures impose some complex challenges for the 
researcher. There are many choices for measuring similarity between cases and for choosing a 
clustering algorithm. The literature for evaluating cluster methods and solutions is geared more 
toward conceptual issues such as geometric shape and density and less toward conditions of 
applied research. 

The present study set out to compare two specific types of clustering solutions that can be 
used for higher education market segmentation based on common measures of student 
characteristics. Of the two procedures compared in this study, the matching-type 
measures/Ward's method procedure produced more easily interpretable clusters with more
corresponding differences in student priorities for attending college. The ability to target students 
based on known or knowable demographic characteristics can be very useful to support service 
development or targeted market penetration. 

The CHAID procedure serves better when one has a single outcome of high interest for 
distinguishing among students. In the present study, the CHAID results were more directly 
related to differences in general academic satisfaction. These results were included in an internal 
report on the satisfaction results and led those academic units at the low end of the satisfaction 
scale to ask for further analyses to better understand and work to improve the sources of student 
dissatisfaction. 